Qianqian Xie

TVIR: Building Deep Research Agents Towards Text--Visual Interleaved Report Generation

Jun 01, 2026
Via arXiv

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Jun 01, 2026
Via arXiv

DR$^{3}$-Eval: Towards Realistic and Reproducible Deep Research Evaluation

Apr 16, 2026
Via arXiv

SiMing-Bench: Evaluating Procedural Correctness from Continuous Interactions in Clinical Skill Videos

Apr 10, 2026
Via arXiv

TaxPraBen: A Scalable Benchmark for Structured Evaluation of LLMs in Chinese Real-World Tax Practice

Apr 10, 2026
Via arXiv

Appear2Meaning: A Cross-Cultural Benchmark for Structured Cultural Metadata Inference from Images

Apr 08, 2026
Via arXiv

Credibility Governance: A Social Mechanism for Collective Self-Correction under Weak Truth Signals

Mar 03, 2026
Via arXiv

EHRNavigator: A Multi-Agent System for Patient-Level Clinical Question Answering over Heterogeneous Electronic Health Records

Jan 15, 2026
Via arXiv

RAAR: Retrieval Augmented Agentic Reasoning for Cross-Domain Misinformation Detection

Jan 08, 2026
Via arXiv

MisSpans: Fine-Grained False Span Identification in Cross-Domain Fake News

Jan 08, 2026
Via arXiv